Indexing with WordNet Synonyms May Improve Retrieval Results

Authors

  • Davide Buscaldi
  • Paolo Rosso
Abstract

This paper describes a method developed for the Robust Word Sense Disambiguation task at CLEF 2009. In our approach, a WordNet-expanded index is generated from the disambiguated document collection. This index contains synonyms, hypernyms and holonyms of the disambiguated words in the documents. Query words are supplemented with terms extracted by means of a pseudo-relevance feedback technique. The resulting set of terms, made up of the query words and the feedback terms, is searched for in both the WordNet-expanded index and the default index. The results show that the extended index did not prove useful: MAP was 14 to 16% lower than that of the base system. However, for some queries, expanding index terms with synonyms proved particularly useful.
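
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny `TOY_WORDNET` dictionary is a hypothetical stand-in for WordNet, the sense identifiers are invented, and the scoring scheme (including the `expanded_weight` parameter) and the frequency-based feedback step are assumptions, since the abstract does not say how the two indexes are combined or how feedback terms are chosen.

```python
from collections import Counter, defaultdict

# Hypothetical miniature stand-in for WordNet: sense id -> related lemmas.
TOY_WORDNET = {
    "bank%river": {"synonyms": ["riverbank"], "hypernyms": ["slope"], "holonyms": ["river"]},
    "bank%finance": {"synonyms": ["depository"], "hypernyms": ["institution"], "holonyms": []},
}


def expansion_terms(sense):
    """Return synonyms, hypernyms and holonyms recorded for a disambiguated sense."""
    entry = TOY_WORDNET.get(sense, {})
    return [t for rel in ("synonyms", "hypernyms", "holonyms") for t in entry.get(rel, [])]


def build_indexes(docs):
    """Build a default index (surface words) and an expanded index (WordNet terms).

    docs maps doc_id -> list of (word, sense) pairs; sense is None when the
    word was not disambiguated.
    """
    default_index = defaultdict(set)
    expanded_index = defaultdict(set)
    for doc_id, tokens in docs.items():
        for word, sense in tokens:
            default_index[word].add(doc_id)
            if sense is not None:
                for term in expansion_terms(sense):
                    expanded_index[term].add(doc_id)
    return default_index, expanded_index


def search(terms, default_index, expanded_index, expanded_weight=0.5):
    """Look every term up in both indexes; expanded hits get a lower (assumed) weight."""
    scores = defaultdict(float)
    for term in terms:
        for doc_id in default_index.get(term, ()):
            scores[doc_id] += 1.0
        for doc_id in expanded_index.get(term, ()):
            scores[doc_id] += expanded_weight
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def feedback_terms(query, docs, default_index, top_docs=2, top_terms=2):
    """Simplified pseudo-relevance feedback: frequent non-query words of the top documents."""
    ranked = search(query, default_index, {})
    counts = Counter()
    for doc_id, _ in ranked[:top_docs]:
        counts.update(word for word, _ in docs[doc_id] if word not in query)
    best = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_terms]
    return [term for term, _ in best]


if __name__ == "__main__":
    docs = {
        "d1": [("fishing", None), ("bank", "bank%river")],
        "d2": [("bank", "bank%finance"), ("loan", None)],
    }
    default_index, expanded_index = build_indexes(docs)
    query = ["river"]
    full_query = query + feedback_terms(query, docs, default_index)
    print(search(full_query, default_index, expanded_index))
```

In this toy collection the query word "river" never occurs in any document, so only the expanded index (through the holonym of the river sense of "bank") can retrieve a match. That is the kind of case where expansion helps; the lower MAP reported above suggests that on the real collection the added terms more often introduced noise.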

Similar resources

A WordNet-Based Indexing Technique for Geographical Information Retrieval

This paper presents an indexing technique based on WordNet synonyms and holonyms. This technique has been developed for the Geographical Information Retrieval task. It may help in finding implicit geographic information contained in texts, particularly if the indication of the containing geographical entity is omitted. Our experiments were carried out with the Lucene search engine over the GeoC...

A WordNet-based Query Expansion Method for Geographical Information Retrieval

This report describes a query expansion method based on the expansion of geographical terms by means of WordNet synonyms and meronyms. We used this method for our participation in the GeoCLEF 2005 English monolingual task, using the well-known Lucene search engine for indexing and retrieval. The obtained results show that the proposed method was not suitable for the GeoCLEF track, while W...

The UPV at GeoCLEF 2008: The GeoWorSE System

This year our system was complemented with a map-based filter. During the indexing phase, all places are disambiguated and assigned their coordinates on the map. These coordinates are stored in a separate index. The search process is carried out in two phases: in the first one, we search the collection with the same method applied in 2007, which exploits the expansion of index terms by means of...

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language, in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is mainly used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

Indexing with WordNet synsets can improve text retrieval

The classical, vector space model for text retrieval is shown to give better results (up to 29% better in our experiments) if WordNet synsets are chosen as the indexing space, instead of word forms. This result is obtained for a manually disambiguated test collection (of queries and documents) derived from the Semcor semantic concordance. The sensitivity of retrieval performance to (automatic) ...

Publication date: 2009